AITopics | subspace cluster

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification

Neural Information Processing Systems | Feb-9-2025, 00:08:51 GMT

We present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the "quintessential" observations that best represent clusters in a dataset, by performing joint inference on cluster labels, prototypes and important features. Simultaneously, BCM pursues sparsity by learning subspaces, the sets of features that play important roles in the characterization of the prototypes. The prototype and subspace representation provides quantitative benefits in interpretability while preserving classification accuracy. Human subject experiments verify statistically significant improvements to participants' understanding when using explanations produced by BCM, compared to those given by prior art.

artificial intelligence, machine learning, prototype, (17 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.47)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Creating user stereotypes for persona development from qualitative data through semi-automatic subspace clustering

Korsgaard, Dannie, Bjorner, Thomas, Sorensen, Pernille Krog, Burelli, Paolo

arXiv.org Artificial Intelligence | Jun-26-2023

Personas are models of users that incorporate motivations, wishes, and objectives. These models are employed in user-centred design to help design better user experiences and have recently been employed in adaptive systems to help tailor the personalized user experience. Designing with personas involves the production of descriptions of fictitious users, which are often based on data from real users. The majority of data-driven persona development performed today is based on qualitative data from a limited set of interviewees and transformed into personas using labour-intensive manual techniques. In this study, we propose a method that employs the modelling of user stereotypes to automate part of the persona creation process and addresses the drawbacks of the existing semi-automated methods for persona development. The description of the method is accompanied by an empirical comparison with a manual technique and a semi-automated alternative (multiple correspondence analysis). The results of the comparison show that manual techniques differ between human persona designers, leading to different results. The proposed algorithm provides similar results based on parameter input, but is more rigorous and finds optimal clusters, while lowering the labour associated with finding the clusters in the dataset. The output of the method also represents the largest variances in the dataset identified by the multiple correspondence analysis.

algorithm, dimension, interviewee, (13 more...)

doi: 10.1007/s11257-019-09252-5

arXiv: 2306.14551

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)
Europe > Netherlands > Limburg > Maastricht (0.04)
(4 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Personal > Interview (0.46)
Research Report > New Finding (0.34)

Industry:

Health & Medicine > Consumer Health (1.00)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)
Information Technology > Human Computer Interaction > Interfaces (0.93)

Improving the Effectiveness and Efficiency of Stochastic Neighbour Embedding with Isolation Kernel

Zhu, Ye | Ting, Kai Ming (Nanjing University)

Journal of Artificial Intelligence Research | Aug-2-2021

This paper presents a new insight into improving the performance of Stochastic Neighbour Embedding (t-SNE) by using Isolation kernel instead of Gaussian kernel. Isolation kernel outperforms Gaussian kernel in two aspects. First, the use of Isolation kernel in t-SNE overcomes the drawback of misrepresenting some structures in the data, which often occurs when Gaussian kernel is applied in t-SNE. This is because Gaussian kernel determines each local bandwidth based on one local point only, while Isolation kernel is derived directly from the data based on space partitioning. Second, the use of Isolation kernel yields a more efficient similarity computation because data-dependent Isolation kernel has only one parameter that needs to be tuned. In contrast, the use of data-independent Gaussian kernel increases the computational cost by determining n bandwidths for a dataset of n points. As the root cause of these deficiencies in t-SNE is Gaussian kernel, we show that simply replacing Gaussian kernel with Isolation kernel in t-SNE significantly improves the quality of the final visualisation output (without creating misrepresented structures) and removes one key obstacle that prevents t-SNE from processing large datasets. Moreover, Isolation kernel enables t-SNE to deal with large-scale datasets in less runtime without trading off accuracy, unlike existing methods in speeding up t-SNE.
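
As a rough illustration of the space-partitioning idea described in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes that each partitioning is a Voronoi diagram built from `psi` points sampled from the data, and that the similarity of two points is the fraction of partitionings in which they fall into the same cell. The names `isolation_kernel`, `psi`, `t`, and the toy data are hypothetical and chosen only for this example.

```python
import numpy as np

def isolation_kernel(X, psi=8, t=200, seed=0):
    """Estimate a partition-based similarity matrix for the rows of X.

    Assumption (not taken from the listing): each of the t partitionings
    is a Voronoi diagram over psi points sampled without replacement
    from X; two points are similar when they often share a cell.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    K = np.zeros((n, n))
    for _ in range(t):
        centres = X[rng.choice(n, size=psi, replace=False)]
        # Squared distance from every point to every sampled centre.
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        cell = d2.argmin(axis=1)  # Voronoi cell index of each point
        # Count the pairs that land in the same cell in this partitioning.
        K += cell[:, None] == cell[None, :]
    return K / t

# Toy data: two well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
               rng.normal(5.0, 0.1, (20, 2))])
K = isolation_kernel(X)
print(K[:20, :20].mean(), K[:20, 20:].mean())  # within-blob vs between-blob
```

In this sketch `psi` is the only setting that adapts the similarity to the data, which mirrors the single-parameter claim in the abstract; the ensemble size `t` only controls how noisy the estimate is. A matrix like `K` could then stand in for the Gaussian similarities in a t-SNE implementation that accepts precomputed affinities.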

gaussian kernel, isolation kernel, kernel, (15 more...)

doi: 10.1613/jair.1.12904

AI Access Foundation

12904

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Oceania > Australia > Victoria (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

A Generic Framework for Interesting Subspace Cluster Detection in Multi-attributed Networks

Chen, Feng, Zhou, Baojian, Alim, Adil, Zhao, Liang

arXiv.org Artificial Intelligence | Sep-20-2018

Detection of interesting (e.g., coherent or anomalous) clusters has been studied extensively on plain or univariate networks, with various applications. Recently, algorithms have been extended to real-world networks with multiple attributes for each node. In a multi-attributed network, a cluster of nodes is often interesting only for a subset (subspace) of attributes; clusters of this type are called subspace clusters. However, in the current literature, few methods are capable of detecting subspace clusters, which involves concurrent feature selection and network cluster detection. The methods that do exist are mostly heuristic-driven and customized for specific application scenarios. In this work, we present a generic and theoretical framework for detection of interesting subspace clusters in large multi-attributed networks. Specifically, we propose a subspace graph-structured matching pursuit algorithm, namely, SG-Pursuit, to address a broad class of such problems for different score functions (e.g., coherence or anomalous functions) and topology constraints (e.g., connected subgraphs and dense subgraphs). We prove that our algorithm 1) runs in nearly-linear time on the network size and the total number of attributes and 2) enjoys rigorous guarantees (geometrical convergence rate and tight error bound) analogous to those of the state-of-the-art algorithms for sparse feature selection problems and subgraph detection problems. As a case study, we specialize SG-Pursuit to optimize a number of well-known score functions for two typical tasks, including detection of coherent dense and anomalous connected subspace clusters in real-world networks. Empirical evidence demonstrates that our proposed generic algorithm SG-Pursuit outperforms state-of-the-art methods that are designed specifically for these two tasks.

data mining, machine learning, subspace cluster, (17 more...)

arXiv: 1709.05246

Country: North America > United States (0.29)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification

Kim, Been, Rudin, Cynthia, Shah, Julie

arXiv.org Machine Learning | Mar-3-2015

We present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the "quintessential" observations that best represent clusters in a dataset, by performing joint inference on cluster labels, prototypes and important features. Simultaneously, BCM pursues sparsity by learning subspaces, the sets of features that play important roles in the characterization of the prototypes. The prototype and subspace representation provides quantitative benefits in interpretability while preserving classification accuracy. Human subject experiments verify statistically significant improvements to participants' understanding when using explanations produced by BCM, compared to those given by prior art.

artificial intelligence, machine learning, prototype, (18 more...)

arXiv: 1503.01161

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Greedy Feature Selection for Subspace Clustering

Dyer, Eva L., Sankaranarayanan, Aswin C., Baraniuk, Richard G.

arXiv.org Machine Learning | Jul-3-2013

Unions of subspaces provide a powerful generalization to linear subspace models for collections of high-dimensional data. To learn a union of subspaces from a collection of data, sets of signals in the collection that belong to the same subspace must be identified in order to obtain accurate estimates of the subspace structures present in the data. Recently, sparse recovery methods have been shown to provide a provable and robust strategy for exact feature selection (EFS)--recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with L1-minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and the gap between the two approaches is particularly pronounced when the sampling of subspaces in the dataset is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble.
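
To make the idea of greedy feature selection concrete, here is a minimal sketch of orthogonal matching pursuit (OMP) used to pick, for one point, a few other points that represent it. It is an illustration under simplifying assumptions and not the authors' code: the helper `omp_select`, the sparsity level `k`, and the toy data (two mutually orthogonal 2-D subspaces in four dimensions, an easy case) are all hypothetical.

```python
import numpy as np

def omp_select(Y, i, k):
    """Greedily pick k columns of Y (other than column i) to represent Y[:, i]."""
    y = Y[:, i]
    residual = y.copy()
    available = np.ones(Y.shape[1], dtype=bool)
    available[i] = False  # a point may not represent itself
    selected = []
    for _ in range(k):
        # Correlation of every candidate column with the current residual.
        corr = np.abs(Y.T @ residual)
        corr[~available] = -np.inf
        j = int(corr.argmax())
        selected.append(j)
        available[j] = False
        # Re-fit y on all selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(Y[:, selected], y, rcond=None)
        residual = y - Y[:, selected] @ coef
    return selected

# Toy data: four unit vectors in span(e1, e2) and four in span(e3, e4).
angles = np.deg2rad([10.0, 40.0, 75.0, 120.0])
zeros = np.zeros(4)
A = np.vstack([np.cos(angles), np.sin(angles), zeros, zeros])
B = np.vstack([zeros, zeros, np.cos(angles + 0.3), np.sin(angles + 0.3)])
Y = np.hstack([A, B])               # columns are data points
labels = np.array([0] * 4 + [1] * 4)

for i in range(Y.shape[1]):
    picked = omp_select(Y, i, k=2)
    print(i, picked, labels[picked])
```

For this toy data every selected point comes from the same subspace as the query point, which is what the abstract calls exact feature selection. With overlapping or sparsely sampled subspaces the selection can fail, and that is the regime the paper's conditions address.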

artificial intelligence, machine learning, subspace, (15 more...)

arXiv: 1303.4778

Country:

North America > United States > Texas (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Incorporating Unsupervised Learning in Activity Recognition

Li, Fei (Vienna University of Technology) | Dustdar, Schahram (Vienna University of Technology)

AAAI Conferences | Aug-8-2011

Users are constantly involved in a multitude of activities in ever-changing contexts. Analyzing activities in context-rich environments has become a great challenge in context-awareness research. Traditional methods for activity recognition, such as classification, cannot cope with the variety and dynamicity of context and activities. In this paper, we propose an activity recognition approach that incorporates unsupervised learning. We analyze the feasibility of applying subspace clustering, a specific type of unsupervised learning, to high-dimensional, heterogeneous sensory input. Then we present the correspondence between clustering output and classification input. This approach has the potential to discover implicit, evolving activities, and can provide valuable assistance to traditional classification-based methods.

activity recognition, dimension, subspace, (12 more...)

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.05)
Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.92)

KiWi: A Scalable Subspace Clustering Algorithm for Gene Expression Analysis

Griffith, Obi L., Gao, Byron J., Bilenky, Mikhail, Prichyna, Yuliya, Ester, Martin, Jones, Steven J. M.

arXiv.org Artificial Intelligence | Apr-13-2009

Numerous studies have used coexpression of large expression datasets to infer functional associations between genes [1], to identify groups of related genes that are important in specific cancers or represent common tumour progression mechanisms [2], to study evolutionary change [3], for integration with other large-scale datasets [4][5][6], and for the generation of high-quality biological interaction networks [7][8][9][10]. A number of studies have also attempted to use coexpression to identify coregulation, with the hypothesis that if two or more genes are expressed at the same time and location and at similar levels then they may be regulated by the same transcription factors and regulatory elements. This approach has shown promise, particularly in simpler model organisms such as A. thaliana and S. cerevisiae [11][12][13][14], and many groups are currently working on implementing this idea in mammalian systems. However, traditional clustering methods have not worked particularly well on large datasets for this problem. Most methods assign each gene to only one cluster, while in reality many genes likely take part in multiple processes. Also, global coexpression is measured across all conditions, whereas it is probable that most genes are only tightly coregulated under certain conditions or locations. In recent years, a new field of clustering analysis termed subspace clustering (or biclustering) has gained increasing popularity in the analysis of gene expression data and other biological data [15][16][17][18][19]. In contrast to traditional clustering methods such as hierarchical clustering, subspace clustering methods do not require expression to be correlated across all conditions for genes to be assigned to the same cluster. This has several advantages for data in which biologically relevant subsets exist (e.g.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv: 0904.1931

Country: North America (0.28)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.30)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
